[Refactor] Add structured inference server config objects#3893

Draft
vmoens wants to merge 4 commits into gh/vmoens/288/base from gh/vmoens/288/head

Conversation

[ghstack-poisoned]

pytorch-bot Bot commented Jun 21, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3893

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEV

There is 1 currently active SEV. If your PR is affected, please view it below:

❌ 8 New Failures

As of commit 39e3aa6 with merge base b660f05 (image):

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

github-actions Bot commented Jun 21, 2026

Contributor

Benchmark Results: PR 39e3aa69 vs main 5cd8f5db

Benchmark run: https://github.com/pytorch/rl/actions/runs/28074360128

Higher ops/sec is better. Tables are sorted by absolute percentage change, largest first.

CPU

Compared 216 benchmarks. Regressions over 5%: 4. Improvements over 5%: 19.

Benchmark  main (ops/sec)  PR (ops/sec)  Change
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 400.75 2,220 +453.93%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 192.92 37.39 -80.62%
benchmarks/test_objectives_benchmarks.py::test_sac_speed[False-backward] 54.39 88.62 +62.93%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 2,821 3,679 +30.44%
benchmarks/test_objectives_benchmarks.py::test_sac_speed[True-backward] 205.91 253.69 +23.20%
benchmarks/test_envs_benchmark.py::test_cat_frames_functional[4-same] 24.64 29.26 +18.76%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2,534 2,882 +13.76%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 3,062 2,642 -13.73%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 1,913 2,153 +12.50%
benchmarks/test_objectives_benchmarks.py::test_redq_deprec_speed[True-backward] 129.62 145.30 +12.10%
benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[True-backward] 375.98 419.39 +11.55%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 504.03 561.90 +11.48%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 3,048 2,709 -11.11%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 50.88 56.45 +10.94%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[untyped_storage] 8.1855 8.9896 +9.82%
benchmarks/test_objectives_benchmarks.py::test_ppo_speed[True-backward] 105.84 115.26 +8.89%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2,712 2,953 +8.88%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[torchvision-256-256-64] 10.91 10.09 -7.60%
benchmarks/test_objectives_benchmarks.py::test_dqn_speed[True-None] 1,665 1,776 +6.71%
benchmarks/test_objectives_benchmarks.py::test_a2c_speed[True-None] 280.26 298.25 +6.42%
benchmarks/test_objectives_benchmarks.py::test_cql_speed[True-None] 82.01 87.06 +6.15%
benchmarks/test_envs_benchmark.py::test_simple 1.7041 1.8036 +5.84%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 1,061 1,117 +5.29%
benchmarks/test_objectives_benchmarks.py::test_cql_speed[False-backward] 27.40 28.71 +4.78%
benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-scan-True-0-gru] 4.0767 4.2616 +4.54%
benchmarks/test_replaybuffer_benchmark.py::TestPrioritizedReplayBufferBenchmark::test_sample_mixed_devices[1000000-memmap_cpu_storage_cpu... 80.20 83.65 +4.30%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_contiguous[100-img_shape2-large_img] 569.83 545.81 -4.21%
benchmarks/test_objectives_benchmarks.py::test_redq_deprec_speed[reduce-overhead-None] 281.12 292.23 +3.95%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[torchvision-480-640-64] 6.5950 6.3368 -3.91%
benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-scan-False-0-gru] 2.9198 3.0338 +3.90%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 759.22 730.15 -3.83%
benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-scan-False-0-lstm] 1.9402 2.0113 +3.67%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[pil-256-256-1] 192.42 185.48 -3.61%
benchmarks/test_objectives_benchmarks.py::test_reinforce_speed[False-None] 207.64 215.13 +3.61%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[100-img_shape2-large_img] 426.04 410.82 -3.57%
benchmarks/test_objectives_benchmarks.py::test_redq_deprec_speed[False-None] 87.29 90.39 +3.55%
benchmarks/test_objectives_benchmarks.py::test_reinforce_speed[True-None] 326.47 337.81 +3.47%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[numpy] 362,799 375,272 +3.44%
benchmarks/test_envs_benchmark.py::test_transformed 0.8822 0.9123 +3.41%
benchmarks/test_objectives_benchmarks.py::test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 641.45 663.21 +3.39%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_with_rb[100-img_shape0-atari] 25.64 26.50 +3.35%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-True-False-True] 41,216 42,582 +3.32%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_stack_then_write[100-img_shape2-large_img] 175.12 169.38 -3.28%
benchmarks/test_objectives_benchmarks.py::test_redq_deprec_speed[True-None] 275.36 284.36 +3.27%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-False-False-False] 49,804 51,431 +3.27%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-True-True-False] 28,945 29,881 +3.24%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 877.67 905.07 +3.12%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_without_rb[200-img_shape1-large_batch] 14.95 15.41 +3.09%
benchmarks/test_objectives_benchmarks.py::test_iql_speed[True-backward] 58.78 60.56 +3.04%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[100-img_shape2-large_img] 403.77 391.61 -3.01%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 503.80 518.51 +2.92%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_without_rb[100-img_shape0-atari] 29.47 30.32 +2.87%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-False-False-True] 33,558 34,509 +2.83%
benchmarks/test_envs_benchmark.py::test_parallel 0.9727 0.9452 -2.82%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-False-False-False] 62,747 64,484 +2.77%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-True-True-True] 20,544 21,098 +2.69%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-False-True-True] 17,963 18,445 +2.69%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2,751 2,678 -2.67%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[50-img_shape0-small] 4,353 4,469 +2.67%
benchmarks/test_objectives_benchmarks.py::test_redq_deprec_speed[False-backward] 62.20 63.84 +2.64%
benchmarks/test_objectives_benchmarks.py::test_reinforce_speed[False-backward] 130.03 133.45 +2.63%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_contiguous[200-img_shape3-large_batch] 773.05 752.82 -2.62%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-single-True] 1.3613 1.3261 -2.59%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[50-img_shape0-small] 3,473 3,563 +2.58%
benchmarks/test_objectives_benchmarks.py::test_cql_speed[reduce-overhead-None] 84.78 86.95 +2.56%
benchmarks/test_objectives_benchmarks.py::test_dqn_speed[True-backward] 964.13 988.78 +2.56%
benchmarks/test_objectives_benchmarks.py::test_a2c_speed[False-None] 175.83 180.27 +2.52%
benchmarks/test_objectives_benchmarks.py::test_cql_speed[True-backward] 59.14 60.62 +2.51%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-True-True-False] 33,978 34,823 +2.49%
benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[True-None] 685.74 702.62 +2.46%
benchmarks/test_objectives_benchmarks.py::test_a2c_speed[False-backward] 82.37 84.39 +2.45%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-True-True-False] 41,254 42,263 +2.45%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-True-False-False] 58,158 56,741 -2.44%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-True-False-True] 32,991 32,191 -2.42%
benchmarks/test_objectives_benchmarks.py::test_iql_speed[False-None] 49.37 50.56 +2.42%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-False-True-True] 19,264 19,711 +2.32%
benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-cudnn-True-0-gru] 1.4636 1.4298 -2.31%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] 0.6032 0.5895 -2.28%
benchmarks/test_objectives_benchmarks.py::test_ppo_speed[reduce-overhead-None] 262.59 268.56 +2.28%
benchmarks/test_objectives_benchmarks.py::test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 537.50 549.65 +2.26%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_contiguous[100-img_shape1-atari] 5,043 4,929 -2.26%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-True-False-True] 36,748 37,574 +2.25%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 24.30 24.84 +2.24%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 193.18 197.49 +2.23%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_stack_then_write[100-img_shape1-atari] 270.39 276.39 +2.22%
benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-cudnn-False-0-lstm] 0.8696 0.8505 -2.19%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] 0.5178 0.5291 +2.18%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[200-img_shape3-large_batch] 330.17 337.35 +2.17%
benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[reduce-overhead-None] 698.79 713.90 +2.16%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[200-img_shape3-large_batch] 307.62 314.18 +2.13%
benchmarks/test_collectors_benchmark.py::test_sync_preempt 16.63 16.27 -2.13%
benchmarks/test_objectives_benchmarks.py::test_reinforce_speed[reduce-overhead-None] 334.11 341.21 +2.12%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[pil-256-256-64] 2.9770 3.0395 +2.10%
benchmarks/test_objectives_benchmarks.py::test_reinforce_speed[True-backward] 123.74 126.31 +2.08%
benchmarks/test_objectives_benchmarks.py::test_td3_speed[True-None] 545.92 557.17 +2.06%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 711.83 726.07 +2.00%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 3,082 3,021 -2.00%
benchmarks/test_objectives_benchmarks.py::test_dqn_speed[False-backward] 505.83 515.89 +1.99%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[100-img_shape1-atari] 694.86 708.55 +1.97%
benchmarks/test_envs_benchmark.py::test_serial 0.5725 0.5834 +1.89%
benchmarks/test_objectives_benchmarks.py::test_cql_speed[False-None] 37.59 38.30 +1.88%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-True-True-True] 23,580 24,018 +1.86%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[torchvision-224-224-1] 636.89 648.50 +1.82%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-False-True-True] 21,845 22,240 +1.81%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2,807 2,757 -1.79%
benchmarks/test_objectives_benchmarks.py::test_td3_speed[False-backward] 89.97 91.57 +1.79%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[pil-256-256-16] 11.94 12.15 +1.78%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 23.32 23.73 +1.77%
benchmarks/test_replaybuffer_benchmark.py::TestPrioritizedReplayBufferBenchmark::test_sampler_sample_scale[1000000-cpu] 97.10 98.82 +1.77%
benchmarks/test_objectives_benchmarks.py::test_ppo_speed[False-backward] 77.85 79.22 +1.76%
benchmarks/test_objectives_benchmarks.py::test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 2,243 2,282 +1.74%
benchmarks/test_objectives_benchmarks.py::test_iql_speed[False-backward] 32.81 33.37 +1.71%
benchmarks/test_objectives_benchmarks.py::test_a2c_speed[True-backward] 119.29 121.33 +1.71%
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 51.91 52.79 +1.70%
benchmarks/test_objectives_benchmarks.py::test_td3_speed[True-backward] 280.22 284.98 +1.70%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 2,126 2,162 +1.69%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[100-img_shape1-atari] 637.74 648.45 +1.68%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2,012 2,045 +1.65%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_with_rb[200-img_shape1-large_batch] 13.26 13.48 +1.62%
benchmarks/test_objectives_benchmarks.py::test_td3_speed[reduce-overhead-None] 569.48 578.53 +1.59%
... ... ... Showing 120 of 216 comparisons, sorted by absolute change.

GPU

Compared 226 benchmarks. Regressions over 5%: 13. Improvements over 5%: 20.

Benchmark  main (ops/sec)  PR (ops/sec)  Change
benchmarks/test_replaybuffer_benchmark.py::test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 28.71 51.78 +80.37%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 190.98 48.49 -74.61%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 3,589 2,613 -27.20%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 3,599 2,721 -24.40%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2,574 3,153 +22.50%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 3,350 2,610 -22.11%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 3,101 2,500 -19.39%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2,982 3,447 +15.58%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 3,227 3,643 +12.90%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_contiguous[100-img_shape1-atari] 3,657 4,103 +12.21%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[100-img_shape1-atari] 725.12 652.78 -9.98%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 732.06 798.79 +9.12%
benchmarks/test_objectives_benchmarks.py::test_dqn_speed[True-backward] 970.63 884.73 -8.85%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2,052 1,881 -8.32%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-True-True-True] 17,581 19,024 +8.21%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-False-False-False] 46,239 49,853 +7.82%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1,974 2,121 +7.45%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 1,810 1,942 +7.31%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 2,036 2,181 +7.12%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 461.18 490.58 +6.38%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-False-False-True] 32,807 34,832 +6.17%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-False-True-False] 30,098 31,944 +6.13%
benchmarks/test_objectives_benchmarks.py::test_ppo_speed[True-backward] 351.53 331.58 -5.68%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 751.64 794.01 +5.64%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-serial-buffers-True] 0.4931 0.5207 +5.60%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[100-img_shape2-large_img] 401.50 379.35 -5.51%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] 0.6682 0.7042 +5.39%
benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-scan-False-0-gru] 22.45 21.27 -5.27%
benchmarks/test_objectives_benchmarks.py::test_ppo_speed[reduce-overhead-None] 780.80 821.24 +5.18%
benchmarks/test_objectives_benchmarks.py::test_sac_speed[True-backward] 326.56 309.73 -5.15%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[torchvision-256-256-4] 165.11 156.66 -5.12%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-True-False-True] 40,653 42,712 +5.07%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-False-False-False] 42,710 44,861 +5.04%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_contiguous[100-img_shape2-large_img] 528.38 502.56 -4.89%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-single-False] 1.5300 1.6022 +4.72%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] 0.5700 0.5968 +4.70%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-True-True-False] 27,759 29,063 +4.70%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-False-False-True] 28,739 30,059 +4.59%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_lazystack[50-img_shape0-small] 4,358 4,163 -4.47%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-True-True-True] 20,116 21,002 +4.41%
benchmarks/test_objectives_benchmarks.py::test_reinforce_speed[False-None] 375.63 392.19 +4.41%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-True-False-True] 31,639 33,013 +4.34%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[untyped_storage] 8.1097 7.7623 -4.28%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_stack_then_write[100-img_shape2-large_img] 160.47 167.33 +4.28%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-False-True-False] 30,575 31,832 +4.11%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-False-False-False] 53,223 55,336 +3.97%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-single-True] 1.2921 1.3428 +3.92%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_stack_then_write[100-img_shape1-atari] 263.22 253.01 -3.88%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 160.48 166.67 +3.86%
benchmarks/test_objectives_benchmarks.py::test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 1,273 1,322 +3.84%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 161.05 167.23 +3.84%
benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[False-backward] 236.69 227.78 -3.77%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[torchvision-224-224-4] 180.62 187.22 +3.65%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 156.93 162.44 +3.51%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 154.80 160.20 +3.49%
benchmarks/test_replaybuffer_benchmark.py::TestPrioritizedReplayBufferBenchmark::test_sample_mixed_devices[1000000-memmap_cpu_storage_cud... 974.80 940.94 -3.47%
benchmarks/test_collectors_benchmark.py::test_single_pixels 6.0618 6.2711 +3.45%
benchmarks/test_objectives_benchmarks.py::test_a2c_speed[True-backward] 355.73 343.78 -3.36%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 162.92 168.24 +3.27%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[pil-256-256-4] 46.92 48.45 +3.26%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-True-True-True] 20,149 20,776 +3.11%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 504.76 520.10 +3.04%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-False-False-True] 27,409 28,223 +2.97%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-True-False-False] 61,820 63,653 +2.97%
benchmarks/test_replaybuffer_benchmark.py::test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 158.12 162.76 +2.94%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-True-False-False] 55,062 56,677 +2.93%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 496.98 482.48 -2.92%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[torchvision-224-224-1] 616.10 634.04 +2.91%
benchmarks/test_objectives_benchmarks.py::test_values[vec_generalized_advantage_estimate-True-True] 302.74 293.99 -2.89%
benchmarks/test_replaybuffer_benchmark.py::test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 761.98 740.28 -2.85%
benchmarks/test_objectives_benchmarks.py::test_cql_speed[True-backward] 219.46 213.29 -2.81%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_without_rb[100-img_shape0-atari] 28.81 29.62 +2.79%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[numpy] 374,034 363,755 -2.75%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 939.29 963.63 +2.59%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-False-False-False] 62,909 64,499 +2.53%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_stack_then_write[50-img_shape0-small] 839.68 860.64 +2.50%
benchmarks/test_envs_benchmark.py::test_serial 0.4151 0.4254 +2.48%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-False-True-True] 21,539 22,072 +2.47%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-False-True-True] 19,208 19,674 +2.43%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-False-True-False-True] 36,892 37,769 +2.38%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-True-False-False] 48,677 49,825 +2.36%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_contiguous[50-img_shape0-small] 6,025 5,889 -2.26%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-False-True-False] 37,536 38,381 +2.25%
benchmarks/test_objectives_benchmarks.py::test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 1,283 1,254 -2.23%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-serial-buffers-False] 0.5872 0.6002 +2.21%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[pil-224-224-64] 4.4848 4.5817 +2.16%
benchmarks/test_objectives_benchmarks.py::test_a2c_speed[False-backward] 147.81 144.63 -2.16%
benchmarks/test_collectors_benchmark.py::test_sync_pixels 10.52 10.30 -2.15%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-True-False-True] 29,864 30,502 +2.14%
benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[True-None] 814.11 797.03 -2.10%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-False-False-True-False] 26,441 26,994 +2.09%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[torchvision-480-640-4] 145.45 148.49 +2.09%
benchmarks/test_non_tensor_env_benchmark.py::test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] 0.2097 0.2139 +1.99%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[pil-224-224-16] 17.93 18.27 +1.90%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_storage_write_contiguous[200-img_shape3-large_batch] 659.83 672.36 +1.90%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_stack_then_write[200-img_shape3-large_batch] 132.29 134.79 +1.89%
benchmarks/test_storage_write_benchmark.py::TestCollectorIntegrationBenchmark::test_collector_without_rb[200-img_shape1-large_batch] 14.69 14.96 +1.88%
benchmarks/test_collectors_benchmark.py::test_single_with_rb_pixels 5.3442 5.2439 -1.88%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[pil-224-224-4] 70.80 72.13 +1.87%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[True-True-True-False-False] 74,851 76,237 +1.85%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[pil-256-256-1] 190.27 193.78 +1.84%
benchmarks/test_objectives_benchmarks.py::test_iql_speed[True-backward] 241.96 237.56 -1.82%
benchmarks/test_replaybuffer_benchmark.py::TestPrioritizedReplayBufferBenchmark::test_sample_mixed_devices[1000000-cuda_storage_cuda_samp... 1,481 1,455 -1.74%
benchmarks/test_rnn_reset_backends_benchmark.py::test_rnn_rollout_with_intermediate_resets[b256-t128-i32-h512-scan-True-0-gru] 49.15 48.29 -1.74%
benchmarks/test_envs_benchmark.py::test_step_mdp_speed[False-True-False-True-True] 19,088 19,418 +1.73%
benchmarks/test_envs_benchmark.py::test_cat_frames_functional[16-constant] 4,592 4,672 +1.73%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[torchvision-256-256-1] 521.83 512.92 -1.71%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[pil-224-224-1] 281.91 286.58 +1.66%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[safetensors] 22,425 22,791 +1.63%
benchmarks/test_storage_write_benchmark.py::TestStorageWriteBenchmark::test_collector_lazystack_then_write[200-img_shape3-large_batch] 289.48 294.13 +1.61%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[torchvision-480-640-1] 472.80 465.42 -1.56%
benchmarks/test_objectives_benchmarks.py::test_td3_speed[True-None] 736.09 724.77 -1.54%
benchmarks/test_objectives_benchmarks.py::test_ddpg_speed[True-backward] 452.48 445.58 -1.52%
benchmarks/test_replaybuffer_benchmark.py::test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 950.03 964.03 +1.47%
benchmarks/test_objectives_benchmarks.py::test_td3_speed[True-backward] 371.40 365.94 -1.47%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[pil-256-256-16] 12.02 12.19 +1.43%
benchmarks/test_vla_preprocessing_benchmark.py::test_openvla_preprocessing_throughput[pil-480-640-1] 77.56 78.63 +1.38%
benchmarks/test_objectives_benchmarks.py::test_iql_speed[False-backward] 67.35 68.27 +1.37%
benchmarks/test_compressed_storage_benchmark.py::TestCompressedStorageBenchmark::test_tensor_to_bytestream_speed[pickle] 11,773 11,932 +1.36%
benchmarks/test_collectors_benchmark.py::test_async_pixels 10.85 10.71 -1.33%
... ... ... Showing 120 of 226 comparisons, sorted by absolute change.
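
For anyone checking the tables by hand: the Change column appears to be the relative difference in ops/sec between the PR and main, and the "over 5%" counts appear to use a symmetric ±5% cutoff. The benchmark workflow's own script is not shown in this thread, so the following is only a minimal sketch under that assumption. `Comparison`, `classify`, and `THRESHOLD_PCT` are hypothetical names, and the three rows are copied from the CPU table above.

```python
from dataclasses import dataclass

# Assumed cutoff: the report counts changes "over 5%" in either direction.
THRESHOLD_PCT = 5.0


@dataclass(frozen=True)
class Comparison:
    """One benchmark row: throughput on main vs. on the PR (ops/sec)."""

    name: str
    main_ops: float
    pr_ops: float

    @property
    def change_pct(self) -> float:
        # Relative change of the PR against main, in percent.
        return (self.pr_ops - self.main_ops) / self.main_ops * 100.0


def classify(row: Comparison, threshold: float = THRESHOLD_PCT) -> str:
    """Label a row using a symmetric percentage threshold."""
    if row.change_pct > threshold:
        return "improvement"
    if row.change_pct < -threshold:
        return "regression"
    return "neutral"


# Values taken from the CPU table in this comment.
rows = [
    Comparison("test_cql_speed[False-backward]", 27.40, 28.71),
    Comparison("test_sac_speed[False-backward]", 54.39, 88.62),
    Comparison(
        "test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400]",
        192.92,
        37.39,
    ),
]

# Sort by absolute percentage change, largest first, as the tables do.
ranked = sorted(rows, key=lambda r: abs(r.change_pct), reverse=True)

for row in ranked:
    print(f"{row.name}: {row.change_pct:+.2f}% ({classify(row)})")
```

Recomputing from the displayed values can differ from the reported figure in the last digits when the table shows rounded throughputs (for example, values printed as `2,220`), so small mismatches are expected.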

Collaborator Author

this is worth a paragraph in the doc somewhere

vmoens added 3 commits June 22, 2026 09:18
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
Labels

CLA Signed: This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Collectors
Documentation: Improvements or additions to documentation
Integrations/torch_geometric
Integrations
Modules
Refactoring: Refactoring of an existing feature


1 participant